AITopics | initial condition

Neural operators have achieved strong performance in learning solution operators of partial differential equations (PDEs), but their inherently continuous representations struggle to capture discontinuities and sharp transitions. Existing approaches typically approximate such features within continuous function spaces, often requiring increased model capacity and high-resolution data. In this work, we propose Cut-DeepONet, a two-stage training framework that explicitly models discontinuities while reducing learning complexity. Our approach reformulates the problem via a lifting strategy, partitioning the domain into smooth subregions while representing discontinuities as boundaries in a higher-dimensional space. This separation aligns the operator learning task with the inductive bias of neural networks and avoids directly approximating discontinuities. An additional network predicts input-dependent discontinuity locations for unseen inputs, which are then used to guide the neural operator in generating smooth components within each region. Experiments on benchmark PDEs show that Cut-DeepONet outperforms state-of-the-art methods, even when trained on low-resolution datasets. The method excels on problems with discontinuities and sharp transitions, while using fewer trainable parameters. Our results highlight the benefits of changing the representation of operator learning rather than increasing model complexity.
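The idea of evaluating smooth components separately on each side of a predicted discontinuity can be illustrated with a toy 1D sketch. This is not the Cut-DeepONet architecture: `predict_cut`, `smooth_left`, and `smooth_right` are hypothetical stand-ins for the networks the abstract describes, and a single jump in one dimension is assumed.

```python
def evaluate_piecewise(x, cut, left_fn, right_fn):
    """Evaluate a 1D piecewise-smooth function with a single jump at `cut`."""
    return left_fn(x) if x < cut else right_fn(x)


def predict_cut(shift):
    """Hypothetical stand-in for a network predicting the input-dependent jump location."""
    return 0.5 + shift


def smooth_left(x):
    """Hypothetical smooth component on the region left of the cut."""
    return 1.0 + 0.1 * x


def smooth_right(x):
    """Hypothetical smooth component on the region right of the cut."""
    return -1.0 + 0.1 * x


# The cut location is predicted first and then used to select the smooth component.
cut = predict_cut(0.1)
values = [evaluate_piecewise(i / 10, cut, smooth_left, smooth_right) for i in range(11)]
```

Each component only has to represent a smooth function; the jump itself comes from switching components at the predicted location.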

artificial intelligence, discontinuity, machine learning, (16 more...)

arXiv.org Machine Learning

2605.19823

Country: Europe (0.28)

Genre: Research Report > Promising Solution (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)


Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models Supplementary material

Neural Information Processing Systems, May-1-2026, 06:27:07 GMT

The appendix is organized into five sections as follows: 1. Appendix A derives the Volterra equation and proves the main result for the homogenized SGD (Theorem 1). 2. Appendix B gives a heuristic derivation of the homogenized SGD approximation to the SDA class of algorithms on the least squares problem and shows that SGD and homogenized SGD are close under orthogonal invariance (Theorem 2). 3. Appendix C gives a general overview of the analysis of a convolution Volterra equation of the type that arises in the SDA class. Unless otherwise stated, all the results hold under Assumptions 1 and 2. We include all statements from the previous sections for clarity. The results presented in this paper concern the analysis of existing methods and a new method that is a variant of an existing method. The results are theoretical and we do not anticipate any direct ethical and societal issues. We believe the results will be used by machine learning practitioners and we encourage them to use them to build a more just, prosperous world.

A.1 Homogenized SGD

We recall that the diffusion model is given by dXt = 2 dZt 1 To connect these diffusions to SGD on the least squares problem (2.1) $f(x) = \frac{1}{2}\|Ax - b\|^2$, we will use the singular value decomposition $U \Sigma V^T$ of $A$. We order the singular values $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \cdots$ in decreasing order. We then let t = VT(Xt ex), where we recall that b = Aex+ . We may do a similar computation with N and conclude that: J(1) = 2 2 2jJ 2 1 '(t) '(s)d s,j In summary, we may express J in terms of N by J(1) = 2 2 2jJ 1 '2(t) N(1) + 22 dh t,jiwith J(0) = EH When (k,n)= k+n and thus '(t)=(1+ t) with (t)= 1+t, the corresponding ODE is precisely bJ(3) The other case is when (k,n)= n, or '(t)=exp( t). We call this the general SDAHB; one recovers SDAHB when 1 =, 2 =0, and = .
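For orientation, the least squares objective (2.1) and a plain single-sample SGD step on it can be sketched as follows. This is a minimal illustration and not the homogenized SGD or the momentum (SDA) methods analyzed in the paper; the function names, the step size, and the uniform row sampling are assumptions.

```python
import random


def least_squares_loss(A, b, x):
    """f(x) = 1/2 * ||A x - b||^2, with A given as a list of rows."""
    return 0.5 * sum(
        (sum(a_j * x_j for a_j, x_j in zip(row, x)) - b_i) ** 2
        for row, b_i in zip(A, b)
    )


def sgd_step(A, b, x, step_size, rng):
    """One SGD step using a single uniformly sampled row a_i of A.

    The gradient of 1/2 * (a_i . x - b_i)^2 with respect to x is (a_i . x - b_i) * a_i.
    """
    i = rng.randrange(len(A))
    residual = sum(a_j * x_j for a_j, x_j in zip(A[i], x)) - b[i]
    return [x_j - step_size * residual * a_j for x_j, a_j in zip(x, A[i])]


# Small usage example on a 2x2 problem (illustrative values only).
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 2.0]
x = [0.0, 0.0]
rng = random.Random(0)
initial_loss = least_squares_loss(A, b, x)
for _ in range(200):
    x = sgd_step(A, b, x, 0.5, rng)
final_loss = least_squares_loss(A, b, x)
```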

artificial intelligence, equation, machine learning, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.54)


55563844bcd4bba067fe86ac1f008c7e-Supplemental.pdf

Neural Information Processing Systems, Apr-25-2026, 23:25:04 GMT

artificial intelligence, machine learning, nash equilibrium, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning (0.46)


Convergence of Actor-Critic Methods with Multi-Layer Neural Networks

Neural Information Processing Systems, Apr-25-2026, 15:44:47 GMT

The early theory of actor-critic methods considered convergence using linear function approximators for the policy and value functions. Recent work has established convergence using neural network approximators with a single hidden layer. In this work we take the natural next step and establish convergence using deep neural networks with an arbitrary number of hidden layers, thus closing a gap between theory and practice. We show that actor-critic updates projected on a ball around the initial condition will converge to a neighborhood where the average of the squared gradients is $O(1/\sqrt{m}) + O(\epsilon)$, with $m$ being the width of the neural network and $\epsilon$ the approximation quality of the best critic neural network over the projected set.
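The projection onto a ball around the initial condition can be illustrated with a minimal sketch. The abstract does not specify the norm, radius, or update rule, so the Euclidean ball, the names `project_onto_ball` and `projected_step`, and the plain gradient-style update are all assumptions made for illustration.

```python
import math


def project_onto_ball(theta, center, radius):
    """Project theta onto the Euclidean ball of the given radius around center."""
    diff = [t - c for t, c in zip(theta, center)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm <= radius:
        return list(theta)
    scale = radius / norm
    return [c + scale * d for c, d in zip(center, diff)]


def projected_step(theta, direction, theta0, radius, lr):
    """Move against `direction`, then project back onto the ball around theta0."""
    stepped = [t - lr * g for t, g in zip(theta, direction)]
    return project_onto_ball(stepped, theta0, radius)
```

Parameters that stay inside the ball are left unchanged; only iterates that leave it are rescaled back to its boundary along the line through the initial point.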

artificial intelligence, machine learning, min 2, (16 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)


17b598fda495256bef6785c2b76c3217-Paper-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems, Apr-24-2026, 20:17:09 GMT

artificial intelligence, machine learning, trajectory, (19 more...)

Neural Information Processing Systems

Country: Asia > India (0.15)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


0cddb7c06f1cd518e1efdc0e20b70c31-Supplemental.pdf

Neural Information Processing Systems, Apr-24-2026, 16:10:14 GMT

artificial intelligence, machine learning, meshgraphnet, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)


Material

Neural Information Processing Systems, Apr-24-2026, 15:15:37 GMT

A.1 Data Configuration The inputs to a hydraulic simulation include an elevation map, initial conditions, and boundary conditions. For a given elevation map, there are infinitely many possible combinations of initial and boundary conditions that could be realized in future events. It is an interesting question how to automatically configure the most relevant initial and boundary conditions to train on, to get a representation that will be useful in potential future real-world scenarios. We suggest a basic configuration that is adequate for the purpose of this paper. These include the water height $h \in \mathbb{R}^{m \times m}$ at each pixel and a staggered grid flux $q \in \mathbb{R}^{2 \times (m-1) \times (m-1)}$ in each direction $x, y$.

artificial intelligence, elevation map, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > India (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)


092cb13c22d51c22b9035a2b4fe76b00-Paper.pdf

Neural Information Processing Systems, Apr-24-2026, 14:14:47 GMT

artificial intelligence, domain adaptation, machine learning, (13 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)


019f8b946a256d9357eadc5ace2c8678-Supplemental.pdf

Neural Information Processing Systems, Apr-24-2026, 10:10:26 GMT

artificial intelligence, machine learning, nullnull, (16 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


Escape dynamics and implicit bias of one-pass SGD in overparameterized quadratic networks

Bocchi, Dario, Regimbeau, Theotime, Lucibello, Carlo, Saglietti, Luca, Cammarota, Chiara

arXiv.org Machine Learning, Apr-6-2026

We analyze the one-pass stochastic gradient descent dynamics of a two-layer neural network with quadratic activations in a teacher--student framework. In the high-dimensional regime, where the input dimension $N$ and the number of samples $M$ diverge at fixed ratio $\alpha = M/N$, and for finite hidden widths $(p,p^*)$ of the student and teacher, respectively, we study the low-dimensional ordinary differential equations that govern the evolution of the student--teacher and student--student overlap matrices. We show that overparameterization ($p>p^*$) only modestly accelerates escape from a plateau of poor generalization by modifying the prefactor of the exponential decay of the loss. We then examine how unconstrained weight norms introduce a continuous rotational symmetry that results in a nontrivial manifold of zero-loss solutions for $p>1$. From this manifold the dynamics consistently selects the closest solution to the random initialization, as enforced by a conserved quantity in the ODEs governing the evolution of the overlaps. Finally, a Hessian analysis of the population-loss landscape confirms that the plateau and the solution manifold correspond to saddles with at least one negative eigenvalue and to marginal minima in the population-loss geometry, respectively.
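The teacher-student setup and the overlap matrices can be sketched in a few lines. The abstract does not give the normalization, so the $1/\sqrt{N}$ scaling of the preactivations, the average over hidden units, the Gaussian inputs, the squared loss, and all function names below are assumptions chosen for illustration rather than the paper's exact model.

```python
import math
import random


def preactivations(W, x):
    """Rows of W dotted with x, scaled by 1/sqrt(N) (assumed normalization)."""
    n = len(x)
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) / math.sqrt(n) for w in W]


def quadratic_net(W, x):
    """Two-layer network with quadratic activations, averaged over hidden units."""
    return sum(z * z for z in preactivations(W, x)) / len(W)


def overlap(W1, W2):
    """Overlap matrix with entries (w1_k . w2_l) / N."""
    n = len(W1[0])
    return [[sum(a * b for a, b in zip(w1, w2)) / n for w2 in W2] for w1 in W1]


def one_pass_sgd_step(W, W_star, lr, rng):
    """Draw a fresh Gaussian sample, label it with the teacher, update the student once."""
    n, p = len(W[0]), len(W)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    err = quadratic_net(W, x) - quadratic_net(W_star, x)
    z = preactivations(W, x)
    # Gradient of 0.5 * err**2 w.r.t. row k is err * (2 / p) * z_k * x / sqrt(N).
    return [
        [w_j - lr * err * (2.0 / p) * z_k * x_j / math.sqrt(n) for w_j, x_j in zip(w, x)]
        for w, z_k in zip(W, z)
    ]
```

Under these assumptions, `overlap(W, W_star)` and `overlap(W, W)` play the role of the student-teacher and student-student overlap matrices, and each call to `one_pass_sgd_step` consumes one sample that is never reused.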

artificial intelligence, machine learning, matrix, (18 more...)

arXiv.org Machine Learning

2604.03068

Country:

Europe > Italy > Lombardy > Milan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report (0.82)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)



Smooth Piecewise Cutting for Neural Operator to Handle Discontinuities and Sharp Transitions
